cWINNOWER Algorithm for Finding Fuzzy DNA Motifs

Author

  • Shoudan Liang
Abstract

The cWINNOWER algorithm detects fuzzy motifs in DNA sequences rich in protein-binding signals. A signal is any short nucleotide pattern that differs from a motif of length l by at most d mutations. The algorithm finds such motifs if multiple mutated copies of the motif (i.e., the signals) are present in the DNA sequence in sufficient abundance. The cWINNOWER algorithm substantially improves the sensitivity of the winnower method of Pevzner and Sze by imposing a consensus constraint, enabling it to detect much weaker signals. We studied the minimum number of detectable motifs q_c as a function of sequence length N for random sequences. We found that q_c increases linearly with N for a fast version of the algorithm based on counting three-member sub-cliques. Imposing consensus constraints reduces q_c by a factor of three in this case, which makes the algorithm dramatically more sensitive. Our most sensitive algorithm, which counts four-member sub-cliques, needs a minimum of only 13 signals to detect motifs in a sequence of length N = 12,000 for (l,d) = (15,4).
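The definition of a signal in the abstract can be illustrated with a minimal Python sketch. This is not the cWINNOWER algorithm itself: it only checks the (l,d) definition by Hamming distance, and it assumes the motif is already known, which the real algorithm does not. The function names (`hamming`, `is_signal`, `find_signals`, `mutate`) and the example motif are hypothetical, chosen for illustration only.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def is_signal(window, motif, d):
    """True if `window` differs from `motif` by at most d mutations."""
    return hamming(window, motif) <= d


def find_signals(sequence, motif, d):
    """Start positions of all length-l windows that are signals of `motif`.

    Illustration only: this scan assumes the motif is known in advance.
    """
    l = len(motif)
    return [
        i
        for i in range(len(sequence) - l + 1)
        if is_signal(sequence[i:i + l], motif, d)
    ]


def mutate(motif, d, rng):
    """Return a copy of `motif` with exactly d substituted positions."""
    chars = list(motif)
    for p in rng.sample(range(len(motif)), d):
        chars[p] = rng.choice([c for c in "ACGT" if c != chars[p]])
    return "".join(chars)


# Hypothetical (l, d) = (15, 4) example with a made-up motif.
rng = random.Random(0)
motif = "ACGTACGTACGTACG"
s1 = mutate(motif, 4, rng)
s2 = mutate(motif, 4, rng)
sequence = "TTTT" + s1 + "TTTT" + s2 + "TTTT"
positions = find_signals(sequence, motif, 4)
```

Because each signal is within d mutations of the motif, any two signals of the same motif are within 2d mutations of each other (triangle inequality), so `hamming(s1, s2) <= 8` holds here. Pairwise similarity of this kind is what makes it possible to look for groups of mutually similar windows without knowing the motif beforehand.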


Similar resources

A Multiobjective Evolutionary Fuzzy System for Promoter Discovery in E. coli

In this contribution, the biological problem of extracting promoters (composed of two nucleotide sequences, TTGACA and TATAAT, separated by between 15 and 22 base pairs) from E. coli DNA sequences is tackled. Classical approaches to this problem, based on probabilistic models of the promoter motifs, fail to make accurate predictions due to the difficulty of properly integra...


Protein motif extraction with neuro-fuzzy optimization

MOTIVATION: The aim is to improve the speed and flexibility of protein motif identification. The proposed algorithm is able to extract both rigid and flexible protein motifs. RESULTS: In this work, we present a new algorithm for extracting the consensus pattern, or motif, from a group of related protein sequences. This algorithm involves a statistical method to find short patterns with hig...


Finding the Optimal Path to Restoration Loads of Power Distribution Network by Hybrid GA-BCO Algorithms Under Fault and Fuzzy Objective Functions with Load Variations

This paper proposes a fuzzy multi-objective hybrid Genetic and Bee Colony Optimization algorithm (GA-BCO) to find the optimal restoration of loads of a power distribution network under fault. Restoration of distribution systems is a complex combinatorial optimization problem that should be solved efficiently in reasonable time. To improve the efficiency of restoration and facilitate the activity...


An Algorithm to Obtain Possibly Critical Paths in Imprecise Project Networks

We consider criticality in project networks having imprecise activity duration times. It is well known that finding all possibly critical paths of an imprecise project network is an NP-hard problem. Here, based on a method for finding critical paths of crisp networks by using only the forward recursion of the critical path method, for the first time an algorithm is proposed which can find all pos...


Functional motifs in Escherichia coli NC101

Escherichia coli (E. coli) bacteria can damage the DNA of gut lining cells and, according to recent reports, may encourage the development of colon cancer. Genetic switches are specific sequence motifs, and many of them are drug targets. It is therefore of interest to identify motifs and their locations in sequences. In the present study, the Gibbs sampler algorithm was used in order to predict and find functional m...



Journal title:
  • Proceedings. IEEE Computer Society Bioinformatics Conference

Volume 2, Issue

Pages -

Publication date: 2003